
    GNN-encoder: Learning a Dual-encoder Architecture via Graph Neural Networks for Passage Retrieval

    Dense retrieval models have recently come to dominate passage retrieval tasks, owing to their superior ability to capture the semantics of input text compared to traditional sparse vector space models. A common practice of dense retrieval models is to use a dual-encoder architecture that represents a query and a passage independently. Though efficient, such a structure loses the interaction between the query-passage pair, resulting in inferior accuracy. To improve the accuracy of dense retrieval models without sacrificing efficiency, we propose a GNN-encoder model in which query (passage) information is fused into passage (query) representations via graph neural networks constructed from queries and their top retrieved passages. In this way, we maintain a dual-encoder structure while retaining some interaction information between query-passage pairs in their representations, achieving both efficiency and efficacy in passage retrieval. Evaluation results show that our method significantly outperforms existing models on the MSMARCO, Natural Questions and TriviaQA datasets, achieving a new state of the art on all three.
    Comment: 11 pages, 6 figures
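The idea of fusing query and passage information over a retrieval graph while keeping a dual-encoder scoring step can be sketched as a toy example. This is a minimal illustration, not the paper's architecture: the random `encode` stand-in, the single round of message passing, and the `alpha` mixing weight are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(texts, dim=8):
    """Stand-in encoder mapping each text to a dense vector.
    A real dual-encoder would use a trained transformer here."""
    return rng.standard_normal((len(texts), dim))

def fuse_via_graph(q_emb, p_emb, topk_idx, alpha=0.5):
    """One round of message passing on the query-passage graph:
    each query aggregates its top retrieved passages, and each
    retrieved passage aggregates information from its query."""
    q_new = q_emb.copy()
    p_new = p_emb.copy()
    for qi, neighbors in enumerate(topk_idx):
        # query node <- mean of its retrieved passage neighbors
        q_new[qi] = (1 - alpha) * q_emb[qi] + alpha * p_emb[neighbors].mean(axis=0)
        for pi in neighbors:
            # passage node <- contribution from the query that retrieved it
            p_new[pi] = (1 - alpha) * p_new[pi] + alpha * q_emb[qi]
    return q_new, p_new

queries = ["what is a dual encoder"]
passages = ["dual encoders embed query and passage separately",
            "graph neural networks pass messages along edges",
            "sparse vector space models use term weights"]

q = encode(queries)
p = encode(passages)

# first-stage retrieval: top-2 passages per query by dot product
scores = q @ p.T
topk = np.argsort(-scores, axis=1)[:, :2]

q_fused, p_fused = fuse_via_graph(q, p, topk)
# ranking still uses a single dot product, so dual-encoder efficiency is kept
final_scores = q_fused @ p_fused.T
print(final_scores.shape)  # (1, 3)
```

Because the fused embeddings are still compared by a single dot product, passage vectors can be precomputed and indexed exactly as in a plain dual encoder.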

    To what extent can control policies influence the epidemic spreading? -- A data-driven analysis based on the first wave of COVID-19

    On May 5th, 2023, the WHO declared an end to the global COVID-19 public health emergency, marking a significant transition from critical emergency response to long-term, sustained COVID-19 prevention and control. At this juncture, we conduct a comprehensive review of the control policies adopted by 127 countries/territories during the first wave of the COVID-19 pandemic, up to July 2nd, 2020, and quantitatively evaluate their impact on the epidemic dynamics through both linear and nonlinear regressions. Our analyses reveal the intrinsic correlations between the strength of control policies and the dynamical characteristics of COVID-19 epidemics, both for each country/territory under consideration and in a global view. These results may help design more economical and more effective preventive measures in the long-term fight against COVID-19.
    Comment: 17 pages, 5 figures, 2 tables
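The kind of linear and nonlinear regression described above can be illustrated with a minimal sketch. All numbers here are invented for illustration and are not the paper's data; the exponential fit is one plausible nonlinear form, assumed for the example.

```python
import numpy as np

# Hypothetical data: a policy-stringency index (0-100) per country and
# a dynamical characteristic of the epidemic, e.g. case growth rate.
stringency = np.array([30.0, 45.0, 60.0, 75.0, 90.0])
growth_rate = np.array([0.25, 0.20, 0.14, 0.10, 0.05])

# Linear regression: growth_rate ~ a * stringency + b
A = np.vstack([stringency, np.ones_like(stringency)]).T
(a, b), *_ = np.linalg.lstsq(A, growth_rate, rcond=None)

# Nonlinear (exponential-decay) fit, done as linear regression on the log:
# growth_rate ~ exp(c * stringency + d)
(c, d), *_ = np.linalg.lstsq(A, np.log(growth_rate), rcond=None)

print(f"linear slope: {a:.4f}")       # negative: stricter policy, slower growth
print(f"exponential rate: {c:.4f}")   # negative as well
```

A negative slope in both fits would indicate that stronger control policies correlate with slower epidemic growth, which is the type of correlation the study quantifies across countries.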

    PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models

    While transformer-based pre-trained language models (PLMs) dominate a number of NLP applications, these models are heavy to deploy and expensive to use. Effectively compressing large-scale PLMs is therefore an increasingly important problem. Quantization, which represents high-precision tensors in a low-bit fixed-point format, is a viable solution. However, most existing quantization methods are task-specific, requiring customized training and quantization with a large number of trainable parameters for each individual task. Motivated by the observation that the over-parameterized nature of PLMs makes it possible to freeze most of the parameters during fine-tuning, we propose a novel ``quantize before fine-tuning'' framework, PreQuant, which differs from both quantization-aware training and post-training quantization. PreQuant is compatible with various quantization strategies and incorporates outlier-aware parameter-efficient fine-tuning to correct the induced quantization error. We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5, and provide an empirical investigation into its workflow that sheds light on its efficacy.
    Comment: Findings of ACL202
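The general idea of low-bit fixed-point quantization with special treatment of outlier weights can be sketched as follows. This is an assumed, simplified uniform-quantization scheme for illustration; the function name, the percentile threshold, and keeping outliers in full precision are illustrative choices, not PreQuant's actual method.

```python
import numpy as np

def quantize_fixed_point(w, bits=4, outlier_pct=1.0):
    """Uniform fixed-point quantization that keeps the most extreme
    weights (outliers) in full precision -- a toy outlier-aware variant."""
    thresh = np.percentile(np.abs(w), 100 - outlier_pct)
    outlier_mask = np.abs(w) > thresh
    # step size so that [-thresh, thresh] maps onto the signed bit range
    scale = thresh / (2 ** (bits - 1) - 1)
    q = np.round(np.clip(w, -thresh, thresh) / scale) * scale
    q[outlier_mask] = w[outlier_mask]  # outliers stay full precision
    return q, outlier_mask

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)          # pretend these are PLM weights
w_q, mask = quantize_fixed_point(w, bits=4)

# "quantize before fine-tuning": quantize first, then a small trainable
# subset (here, loosely, the full-precision outliers) absorbs the error
err = np.abs(w - w_q).mean()
print(f"mean quantization error: {err:.4f}, outliers kept: {mask.sum()}")
```

The point of the quantize-first ordering is that the expensive quantization step is task-agnostic and done once, while the per-task work reduces to a small parameter-efficient correction.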